The Anjials of Statistics 
2011, Vol. 39, No. 1, 613-642 
DOI: 10.1214/10-AOS848 

@ Institute of Mathematical Statistics, 2011 



MULTIPLE TESTING VIA FDR^ FOR LARGE-SCALE 
IMAGING DATA 

By Chunming Zhang\ Jianqing Fan^ and Tag Yu 

University of Wisconsin-Madison, Princeton University 
and National University of Singapore 

The multiple testing procedure plays an important role in de- 
tecting the presence of spatial signals for large-scale imaging data. 
Typically, the spatial signals are sparse but clustered. This paper 
provides empirical evidence that for a range of commonly used con- 
trol levels, the conventional FDR procedure can lack the ability to 
detect statistical significance, even if the p- values under the true null 
hypotheses are independent and uniformly distributed; more gener- 
ally, ignoring the neighboring information of spatially structured data 
will tend to diminish the detection effectiveness of the FDR proce- 
dure. This paper first introduces a scalar quantity to characterize the 
extent to which the 'Hack of identification phenomenon" (LIP) of the 
FDR procedure occurs. Second, we propose a new multiple compari- 
son procedure, called FDRl, to accommodate the spatial information 
of neighboring p- values, via a local aggregation of p- values. Theoret- 
ical properties of the FDRl procedure are investigated under weak 
dependence of p-values. It is shown that the FDRl procedure alle- 
viates the LIP of the FDR procedure, thus substantially facilitating 
the selection of more stringent control levels. Simulation evaluations 
indicate that the FDRl procedure improves the detection sensitivity 
of the FDR procedure with little loss in detection specificity. The 
computational simplicity and detection effectiveness of the FDRl 
procedure are illustrated through a real brain fMRI dataset. 

1. Introduction. In many important applications, such as astrophysics, 
satellite measurement and brain imaging, the data are collected at spatial 
grid points, and a large-scale multiple testing procedure is needed for de- 
tecting the presence of spatial signals. For example, functional magnetic 
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resonance imaging (fMRI) is a recent and exciting imaging technique that 
allows investigators to determine which areas of the brain are involved in 
a cognitive task. Since an iMRI dataset contains time-course measurements 
over voxels, the number of which is typically of the order of 10^-10^, a mul- 
tiple testing procedure plays an important role in detecting the regions of 
activation. Another example of important application of multiple testing is 
to the diffusion tensor imaging, which intends to identify brain white matter 
regions [Le Bihan et al. (2001)]. 

In the seminal work, Worsley et al. (2002) proposed a Gaussian random 
field method which approximates the family-wise error rate (FWER) by 
modeling test statistics over the entire brain as a Gaussian random field. 
It has been found to be conservative in some cases [Nichols and Hayasaka 
(2003)]. Nichols and Hayasaka (2003) also discussed the use of permutation 
tests and their simulation studies showed that permutation tests tended 
to be more sensitive in finding activated regions. The false discovery rate 
(FDR) approach has become increasingly popular. The conventional FDR 
procedure offers the advantage of overcoming the conservativeness drawback 
of FWER, requiring fewer assumptions than random field based methods 
and being computationally less intensive than permutation tests. 

Nevertheless, in practical applications to imaging data with a spatial 
structure, even if the p-values corresponding to the true null hypotheses 
are independent and uniformly distributed, the conventional FDR proce- 
dure may lack the ability to detect statistical significance, for a range of 
commonly used control levels a. It will be seen, in the left panels of Figure 
2, that the FDR procedure for a 2D simulated data declares only a couple of 
locations to be significant for a ranging from to about 0.4. That is, even 
if we allow FDR to be controlled at the level 40%, one cannot reasonably 
well identify significant sites. The empirical evidence provided above for the 
standard FDR procedure is not pathological. Indeed, similar phenomena 
arise from commonly used signals plus noise models for imaging data, as 
will be exemplified by extensive studies in Section 4.2. In statistical litera- 
ture, while some useful finite-sample and asymptotic results [Storey, Taylor 
and Siegmund (2004)] have been established for the FDR procedure, the re- 
sults could not directly quantify the loss of power and "/acA; of identification 
phenomenon'^ (LIP)- 

More generally, for spatially structured imaging data, the significant lo- 
cations are typically sparse, but clustered rather than scattered. It is thus 
anticipated that a location and its adjacent neighbors fall in a similar type of 
region, either significant (active) or nonsignificant (inactive). As will be seen 
in the simulation studies (where the LIP does not occur) of Section 5, the 
existing FDR procedure tends to be less effective in detecting significance. 
This lack of detection efficiency is due to the information of p-values from 
adjacent neighbors not having been fully taken into account. Due to the 
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popularity of the FDR procedure in research practices, it is highly desirable 
to embed the spatial information of imaging data into the FDR procedure. 

This paper aims to quantify the LIP and to propose a new multiple testing 
procedure, called FDR^, for imaging data, to accommodate the spatial in- 
formation of neighboring p-values, via a local aggregation of p-values. Main 
results are given in three parts. 

• In the first part, statistical inference for the null distribution of locally 
aggregated p-values is studied. See Method I proposed in Section 3.2 and 
Method II in Section 3.3. 

• In the second part, asymptotic properties of the FDR/, procedure are 
investigated under weak dependence (to be defined in Section 4.1) of p- 
values. See Theorems 4.1-4.3. 

• The third part intends to provide a more in-depth discussion of why the 
LIP occurs and the extent to which the FDR/, procedure alleviates the 
LIP. In particular, we introduce a scalar a^o to quantify the LIP: the 
smaller the Ooo, the smaller control level can be adopted without encoun- 
tering LIP; Ooo = rules out the possibility of the LIP. In the particular 
case of i.i.d. p-values. Theorem 4.4 provides verifiable conditions under 
which Ooo = and under which Uoo > 0. Theorem 4.5 demonstrates that 
under mild conditions, Ooo of the FDR/^ procedure is lower than the coun- 
terpart of the FDR procedure. These theoretical results demonstrate that 
the FDR/, procedure alleviates the extent of the LIP, thus substantially 
facilitates the selection of user-specified control levels. As observed from 
the middle and right panels of Figure 2, for control levels close to zero, the 
FDR/, procedure combined with either Method I or Method II identifies 
a larger number of true significant locations than the FDR procedure. 

The rest of the paper is arranged as follows. Section 2 reviews the conven- 
tional FDR procedure and introduces a^o to characterize the LIP. Section 3 
describes the proposed FDR/, procedure. Its theoretical properties are estab- 
lished in Section 4, where Section 4.2 explores the extent to which the FDR/, 
procedure alleviates LIP. Sections 5 and 6 present simulation comparisons of 
the FDR and FDR/, procedures in 2D and 3D dependent data, respectively. 
Section 7 illustrates the computational simplicity and detection effective- 
ness of the proposed method for a real brain fMRI dataset for detecting 
the regions of activation. Section 8 ends the paper with a brief discussion. 
Technical conditions and detailed proofs are deferred to the Appendix. 

2. FDR and lack of identification phenomenon. 

2.1. Conventional FDK procedure. We begin with a brief overview of the 
conventional FDR procedure that is of particular relevance to the discussion 



4 



C. ZHANG, J. FAN AND T. YU 



in Sections 3 and 4. For testing a family of null hypotheses, {-ffo(^)}r=i5 sup- 
pose that Pi is the p- value of the ith test. Table 1 summarizes the outcomes. 

Benjamini and Hochberg (1995) proposed a procedure that guarantees 
the False Discovery Rate (FDR) to be less than or equal to a pre-selected 
value. Here, the FDR is the expected ratio of the number of incorrectly 
rejected hypotheses to the total number of rejected hypotheses with the ra- 
tio defined to be zero if no hypothesis is rejected, that is, FDR = E{^jj) 
where V 1 = max(i?, 1). A comprehensive overview of the development of 
the research in the area of multiple testing can be found in Benjamini and 
Yekutieh (2001), Genovese and Wasserman (2002), Storey (2002), Dudoit, 
Shaffer and Boldrick (2003), Efron (2004), Storey, Taylor and Siegmund 
(2004), Genovese and Wasserman (2004), Lehmann and Romano (2005), 
Lehmann, Romano and Shaffer (2005), Genovese, Roeder and Wasserman 
(2006), Sarkar (2006), Benjamini and Heller (2007) and Wu (2008), among 
others. Fan, Hall and Yao (2007) addressed the issue on the number of hy- 
potheses that can be simultaneously tested when the p- values are computed 
based on asymptotic approximations. 

Storey, Taylor and Siegmund (2004) gave an empirical process definition 
of FDR, by 

(2.1) FDR«) = £{^^ 

where t stands for a threshold for p- values. For realistic applications. Storey 
(2002) proposed the point estimate of FDR(t) by 

(2.2) FDR(t) = ^ ^, 

^ ^ ^' {i?(t) Vl}(l-A)' 

where A € (0, 1) is a tuning constant, and W{t) is the number of nonrejections 
with a threshold t. The intuition of this will be explained in Section 3.4. The 
pointwise limit of FDR(t) under assumptions (7)-(9) of Storey, Taylor and 
Siegmund (2004) is 

(9 m^^m Ml-go(A)} + vri{l-Gi(A)}]t 

^'•'^ {vroGo(t)+vr,Gi(t)}(l-A) ' 



Table 1 

Outcomes from testing n (null) hypotheses Ho(i) based on a 
significance rule 
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where ttq = liiRn^ooiT'o/n, vri = 1 — ttq, and limn^ooVit)/no = Go{t) and 
lim„_j.oo S(t)/ni = Gi{t) are assumed to exist almost surely for each t G (0, 1]. 
For a pre-chosen level a, a data-driven threshold for p- values is determined 
by 

(2.4) t„(roR) = sup{0<t<l:FDR(t)<a}. 

A null hypothesis is rejected if the corresponding p- value is less than or equal 
to the threshold ^^(FDR). Methods (2.2) and (2.4) form the basis for the 
conventional FDR procedure. 

2.2. Proposed measure for lack of identification phenomenon. Recall that 
the FDR procedure is essentially a threshold-based approach for multiple 
testing problems, where the data-driven threshold ^^(FDR) plays a key role. 
It is clearly seen from (2.4) that ta(FDR) hinges on both the estimates 
FDR(t) devised, as well as the control level a specified. 

Using (2.2), we observe that the corresponding iQ(FDR) is a nondecreas- 
ing function of a. This indicates that for the FDR procedure, as a decreases 
below info<t<i FDR(t), the threshold to, (FDR) will drop to zero and accord- 
ingly, the FDR procedure can only reject those hypotheses with p- values 
exactly equal to zero. We call this phenomenon "/ac/c of identification.^^ 

To better quantify the "ZacA; of identification phenomenon^^ (LIP), the 
limiting forms of FDR(t) as n — t- oo will be examined. 

Definition 1. For estimation methods FDR(t) in (2.2), define 

a™R= inf FDR°°(t), 

°° 0<t<l 

^ — ^oo 

where FDR (t) is defined in (2.3). Define the endurance by -Efdr = 1 — 

Notice that the existence of > implies the occurrence of the LIP: in 
real data applications with a moderately large number n of hypotheses, the 
FDR procedure loses the identification capability when the control level a is 
close to or smaller than a^^. On the other hand, the case ct^^ = rules 
out the possibility of the LIP. Henceforth, the smaller the a^^, the higher 
endurance of the corresponding FDR, and the less likely the LIP happens. 
In other words, an FDR estimation approach with a higher endurance is 
more capable of adopting a smaller control level, thus reducing the extent of 
the LIP problem. We will revisit this issue in Section 4.2 after introducing 
the proposed FDRj;, procedure. 
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3. Proposed FDR^, procedure for imaging data. Consider a set of spa- 
tial signals {/^(u) : f G V C Z'^} in a 2D plane {d = 2) or a 3D space (d = 3), 
where fi{v) = for -y G Vq, /^(f ) 7^ for u G Vi and Vq U Vi = V. Here Vq and 
Vi are unknown sets. A common approach for detecting the presence of the 
spatial signals consists of two stages. In the first stage, test the hypothesis 



at each location v. The corresponding p- value is denoted by p{v). In the 
second stage, a multiple testing procedure, such as the conventional FDR 
procedure, is applied to the collection, {p{v) :f G V C Z''}, of p- values. 

In the second stage, instead of using the original p- value, p{v), at each v, 
we propose to use a local aggregation of p- values at points located adjacent 
to V. We summarize the procedure as follows. 

Step 1. Choose a local neighborhood with size k. 

Step 2. At each grid point v, find the set of its neighborhood points, 
and the set {p{v') : v' G N^,} of the corresponding p-values. 

Step 3. At each grid point v, apply a transformation /: [0, l]'^ 1— t- [0, 1] 
to the set of p- values in Step 2, leading to a "locally aggregated" quantity, 



Step 4. Determine a data-driven threshold for {p*{v) : w G V C Z'^}. 

For notational clarity, we denote by the collection of "locally 

aggregated" p*-values, {p* (u) : G V C Z''} . Likewise, the notation U*{t), 
V*{t), T*{t), S*{t), W*{t) and R*{t) can be defined as in Section 2, with pi 
replaced by p*. For instance, V*{t) = J27=i I{-f^o(^) is true, and p* < t} and 
R*{'t) = X^r=i ^Pi — with l(-) an indicator function. Accordingly, the false 
discovery rate based on utilizing the locally aggregated p*-values becomes 



As a comparison, FDR(t) in (2.1) corresponds to the use of the original 
p- values. 

3.1. Choice of neighborhood and choice of f ■ As in Roweis and Saul 
(2000), the set of neighbors for each data point can be assigned in a variety 
of ways, by choosing the k nearest neighbors in Euclidean distance, by con- 
sidering all data points within a ball of fixed radius or by using some prior 
knowledge. 



Hq{v) : fi{v) = versus Hi{v) : fi{v) ^ 



P*{v)=f{{p{v'):v'eN,}). 



(3.1) 
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For the choice of the transformation function, /, one candidate is the me- 
dian fiher, apphed to the neighborhood p-values, without having to specify 
particular forms of spatial structure. A discussion on other options for / 
can be found in Section 8. Unless otherwise stated, this paper focuses on the 
median filtering. 

3.2. Statistical inference for p* -values: Method I. Let G*{-) be the cumu- 
lative distribution function of a "locally aggregated" p*-value correspond- 
ing to the true null hypothesis. Let G*{-) be the sample distribution of 
{p*{v) : V G Vo}. Recall that the original p- value corresponding to the true 
null hypothesis is uniformly distributed on the interval (0,1). In contrast, 
the distribution G*{-) for a "locally aggregated" p*-value is typically nonuni- 
form. This indicates that a significance rule based on p- values is not directly 
applicable to the significance rule based on p*-values. For the median oper- 
ation /, we propose two methods for estimating G*{-). Method I is particu- 
larly useful for large-scale imaging datasets, whereas Method II is useful for 
data of limited resolution. 

Method I is motivated from the observation: if the original p-values are 
independent and uniformly distributed on the interval (0,1), then the me- 
dian aggregated p*-value follows a Beta distribution. More precisely, if the 
neighborhood size k is an odd integer, then the median aggregated p*-value 
conforms to the 

(3.2) Beta((A; + l)/2,(fc + l)/2) 

distribution [Casella and Berger (1990)]. If k is an even integer, the median 
aggregated p*-value is distributed as a random variable {X + Y)/2, where 
{X, Y) has the joint probability density function k\/{{k/2 — l)!}^x'^/^~^(l — 
y-jfc/2-ij|-Q < x < y < 1). Thus, as long as the resolution of the experiment 
data and imaging technique keeps improving, so that the proportion of 
boundary grid points (corresponding to those with neighborhood intersected 
with both Vo and Vi) decreases and eventually shrinks to zero, G*{-) will 
tend to the Beta distribution in (3.2). 

Following this argument, if the original p- values corresponding to the true 
null hypotheses are independent and uniformly distributed [see, e.g., van der 
Vaart (1998), page 305], the median aggregated p*-values corresponding to 
the true null hypotheses will approximately be symmetrically distributed 
about 0.5. Thus, assuming that the number of false null hypothesis with 
p* > 0.5 is negligible, the total number of true null hypotheses, uq, is ap- 
proximately 2^"^^I(p* > 0.5) + Y17=i^(Pi ~ '-'•^)' number of true 
null hypotheses with j5*-values smaller than or equal to t could be estimated 
by Sr=iI{Pi ^ (1 ~ *)}) for small values of t. Here, owing to the symmetry, 
we use the upper tail to compute the proportion to mitigate the bias caused 
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by the data from the alternative hypotheses. Hence, G*{t) can be estimated 
by the empirical distribution function, 

2Er=ii(pi>o.5)+Er=ii(K=o.5)' 

if 0<t<0.5, 

2Er=ii(K>o.5)+Er=i%*=o.5)' 

if 0.5<t<l. 

A modification of the Glivenko-Cantelli theorem shows that supQ<j<]^ 
G*{t)\ = 0(1) almost surely as n — t- 00. This method is distribution free, com- 
putationally fast and applicable when the p*-values under the null hypothe- 
ses are not too skewedly distributed. _ 

An alternative approach for approximating G*{-) is inspired by the cen- 
tral limit theorem. If the neighborhood size k is reasonably large (e.g., k>5 
if the original p- values corresponding to the true null hypotheses are inde- 
pendent and uniformly distributed), then G*{-) could be approximated by 
a normal distribution centered at 0.5. This normal approximation scheme 
may be exploited in the situation (which rarely occurs, though) when the 
original p- values corresponding to the true null hypotheses are independent 
but asymmetric about 0.5 (when the null distribution function of the test 
statistic is discontinuous). 

3.3. Refined method for estimating G*{-): Method II. More generally, we 
consider spatial image data of limited resolution. Recall the neighborhood 
size of a voxel v in the paper includes one for v itself. Let ni{v) denote the 
number of points in that belong to Vi. Thus for any grid point v GVq, 
r\i{v) takes values {0, 1, . . . , A; — 1}. Set 

9nj = P{ni{v)=j}, Q* (t) = P{p* {v)<t\ni{v)=j}. 

Clearly, Ej=o ^"J ~ ^- Therefore, the C.D.F. of p*{v) for a grid point v £Vo 
is given by 

(3.4) G*{t) = OnfiQm + On,lQl{t) + ■■■+ 0„,fc-lQLi(t), 

where Q^it) corresponds to, for independent tests, the Beta distribution 
function in (3.2). 
Likewise, we obtain 

k-l ^ 

G*{t)=Y,On,,Q]it), 
3=0 



(3.3) G*{t) = < 
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where Onj = ^^Vq^ /uq is the proportion of u G Vq with j neighboring grid 

points in Vi, and Q*{t) = X^^^^Cj) <t}/i^VQ^ is the sample distri- 

bution of {p*{v) : v G Vq }, with denoting the number of elements in 

a set A and Vq"''' = {f G Vq : ni(?;) = j}. Clearly, if the original p-values cor- 
responding to the true null hypotheses are block dependent, then, by the 
Glivenko-Cantelli theorem, supo<j<i \ G*{t) — G*{t) \ = o(l) almost surely, as 
n — oo. _ 
We propose the following Method II to estimate G*{t): 

1. Obtain estimates no and ni = n — uq of uq and ni, respectively. One 
possible estimator of no is no = Y17=i^(Pi ^ ~ G*{X)}, for some 
tuning parameter A. 

2. Define Vi = {v £ V :p*{v) < P*(^n-^-^}, where {p^.^jf^^ denote the order statis- 
tics of {p*}f=i. Define Vo = {v£ V:p*{v) >P(^^)}. 

3. Set V^^'^ = {v£Vo: miv) = j}. Estimate j = 0, . . . , /c - 1, by Onj = 

4. For j = 0, estimate Q^{t) by Q*Q{t) =j!*{t), the estimator of G*{t) by 
Method I in Section 3.2. To estimate Qj{t), j = 1, . . . ,k — 1, for each v G 

Vq'^^ collect its neighborhood p- values, randomly exclude j of them and 
obtain the set Dj{v) for the remaining neighborhood p- values. Randomly 
sample j grid points from Vi and collect their corresponding p-values in 
a set Aj{v). Compute the median, Pj{v), of p- values in Dj{v) U Aj{v). 

Estimate Q*{t) by Q*{t) = Z.^^w Hp*{v) < i}/#Vr. 

5. Combining (3.4), G*{t) is estimated by = ^^^^ 

3.4. Significance rule for p* -values. Using the locally aggregated p*- 
values, we can estimate FDR/,(i) defined in (3.1) by either 

(3,5) Fi5R.«) = '^••'^'^•"1 . 

{R*{t)yi}{l-G*iX)} 

using Method I, or 

w*ix)Gm 



(3.6) FDRL(t) 



{R*{t)Vl}{l-GiiX)} 



using Method II. The logic behind this estimate is the following. If we choose 
A far enough from zero, then the number of nonrejections, W*{X), is roughly 
U*{X). Using this, we have 

V*{X) « noG*{X) ^ {V*{X) + W*{X)}G*{X). 



10 



C. ZHANG, J. FAN AND T. YU 



Solving the above equation suggests an estimate of V*{X) by W*{X)G*{X)/{1 — 
G*{X)}. Now, using V*{t)/V*{X) w G*{t)/G*{X), we obtain that at a thresh- 
old t, V*{t) can be estimated by W*{X)G*{t)/{l - G*{X)}. This together 
with the definition of FDRi(t) in (3.1) suggests the estimate in (3.5). In- 
terestingly, in the particular case of p* = pi and G*{t) = t [or G*{t) = t], 
FDRi(t) coincides with FDR(t) defined in (2.2). 

For a given control level a, a null hypothesis is rejected if the associated 
p* -value is smaller than or equal to the threshold, 

(3.7) t„(FDRL) = sup{0 < t < 1 : FDRL(t) < a}. 

This data-driven threshold for p*-values together with the point estimation 
method (3.5) [or (3.6)] for the false discovery rates comprises the proposed 
FDRi procedure. 



4. Properties of the FDR^, procedure. 



4.1. Asymptotic behavior. This section explores the asymptotic behav- 
ior of the FDRi procedure under weak dependence of p-values. Technical 
assumptions are given in Condition A in the Appendix, where Conditions 
A1-A3 are similar to assumptions (7)-(9) of Storey, Taylor and Siegmund 
(2004). Thus the type of dependence in Condition A2 includes finite block 
dependence, and certain mixing dependence. Theorems 4.1-4.3 can be con- 
sidered a generalization of Storey, Taylor and Siegmund (2004) from a single 
p- value to locally aggregating a number k of p- values with A; > 1. 

Theorem 4.1 below reveals that the proposed estimator FDRi(t) controls 
the FDRi(t) simultaneously for all t>6 with 5 > 0, and in turn supplies a 
conservative estimate of FDRi;,(t). 

Theorem 4.1. Assume Condition A in Appendix A. For each 5 > 0, 

. v*(t) ^ 

lim inf <^ FDRL(t) - ^ \ > > 

n^oot>s\ ^ ' R*{t) V 1 J ~ 



and 



lim inf{FDRi(t) -FDRL(t)} > 

n— >cxD t>S 



with probability one. 

To show that the proposed FDRi(t) asymptotically provides a strong 
control of FDRi(t), we define 

an V^.^(f) Mi-gS(A)} + vri{i-Gt(A)}]G-(t) 

^ ' {7roGS(t) + 7riGt(t)}{l-G*-(A)} ' 
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which is the pointwise Umit of FDR/,(t) under Condition A in Appendix A, 
where it is assumed that ttq = Imin^oono/n, and hm„_).oo ^* (t) /?^o = CqCO 
and liuin-^oo S* (t) /ni = exist almost surely for each t G (0,1], and 

G*°°(t)=lim„^oo G*{t). 

Theorem 4.2. Assume Condition A in Appendix A. If there is a t £ 
(0, 1] such that FDR^(t) < a, then limsup„^oo FDRl (t^ (FDRl)) < a. 

Theorem 4.3 states that the random thresholding rule iQ,(FDRL) con- 

— ~ — oo, 

verges to the deterministic rule taiFDK^ ). 

— — —oo 

Theorem 4.3. Assume Condition A in Appendix A. IfFDKj^ (•) has a 
nonzero derivative at the point tQ,(FDRL ) G (0, 1), then lim„_j.oo tQ.(FDRL) = 

— — oo 

to (FDRl ) holds almost surely. 

4.2. Conditions for lack of identification phenomenon. 

Definition 2. For estimation methods FDRL(t) in (3.5) [or (3.6)], de- 
fine 

a™^^= inf FDRL(t), 
°° o<t<i ^ ^ 

— oo 

where FDRl (t) is defined in (4.1). 

Theorem 4.4 establishes conditions under which the LIP does or does 
not take place with the FDR and FDRl procedures. It will be seen that 
the conditions are characterized by the null and alternative distributions of 
the test statistics, without relying on the configuration of the neighborhood 
used in the FDRl procedure. Theorem 4.5 demonstrates that > a™^^ 

under mild conditions, thus the FDRl procedure reduces the extent of the 
LIP. For expository brevity, we assume the test statistics are independent, 
which can be relaxed. 

Theorem 4.4. Let {T{v) it; G V C Z'^} be the set of test statistics for 
testing the presence of the spatial signals {fJ-{v) :f G V C Z*^}. Consider the 
one-sided testing problem, 

(4.2) Ho{v) : fi{v) = versus Hi{v) : fi{v) > 0. 

For j = and j = 1, respectively, assume that T{v), corresponding to the true 
Hj(v), are i.i.d. random variables having a cumulative distribution function 
Fj with a probability density function fj. Assume that the neighborhood size 
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A; > 3 used in the FDR^ procedure is an odd integer and that the propor- 
tion of boundary grid points within Vq shrinks to zero, as n — )• oo , that is, 
lim^^oo #V|S°V"'0 = 1, where V^"^ = {veV: ii{v') = for any v' E iV„}. As- 
sume Condition Al in Appendix A. Let xq = -Fq~^(1) = inf{t : -Fo(0 = !}■ 

I. // lim,^,,_ ^ = oo, then oF^^ = and oF^^^ = 0. 
II. //limsup^^3,g_ < oo, then a™^ > and a™^^ > 0. 

Theorem 4.5. Assume the conditions in Theorem 4-4- Suppose that 
/o(-) is supported in an interval; /i(x) < fo{x) for any x < Fq~^(0.5); 1 — 
Fo(Ff 1(0.5)) < A < 0.5. Then > a^^^ . 

Corollaries 1 and 2 below provide concrete applications of Theorems 4.4 
and 4.5. The detailed verifications are omitted. 



Corollary 1. Assume the conditions in Theorem 4-4- Suppose that 
the distribution Fq is A^(0, 1) and the distribution Fi is N{C,a'^), where 
a € (0,oo) and C G (0,oo) are constants. 

I. // cr > 1, then Q™^ = and a™^ = 0. 

II. //O < cr < 1, then a™^ > and a™^^ > 0. Moreover, ifexp{-{C/af/2}/ 
a<l andl- Fo{C)<X< 0.5, then a™^ > a™^^ . 

Corollary 2. Assume the conditions in Theorem 4-4- Suppose that the 
distribution Fq is that of a Student's td^ variate with do degrees of freedom 
and the distribution Fi is that of C plus a Student's t^^ variate with di 
degrees of freedom, where C G (0,oo) is a constant. 

I. // do > di, then = and a™^^ = 0. 

II. Ifl<do<di, then q™^ > and a™^^ > 0. Moreover, if do = di 
and l-Fo{C)<X< 0.5, then a™R > ^fdr^ _ 

Remark 1. For illustrative simplicity, a one-sided testing problem (4.2) 
is focused upon. Two-sided testing problems can similarly be treated and 
we omit the details. 

4.3. An illustrative example of > a™^^ > 0. Consider a pixelated 
2D image dataset consisting of n = 50 x 50 pixels, illustrated in the left panel 
of Figure 1 , where the black rectangles represent the true significant regions 
Vi with ni = 0.16 x n pixels and the white background serves as the true 
nonsignificant regions Vq with no = n — ni pixels. The data are simulated 
from the model. 



yihj) = Khj) + ^ihj), j = 1, • • • , 50, 
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where the signals are fJ-iijj) = for G Vq, and niijj) = C for G Vi 
with a constant C G (0,oo), and the error terms {e(i,j)} are i.i.d. follow- 
ing the centered Exp(l) distribution. At each site the observed data 
Y{i,j) is the (shifted) survival time and used as the test statistic for test- 
ing fJ-{i,j) = versus > 0. Clearly, all test statistics corresponding to 
the true null hypotheses are i.i.d. having the probability density function 
/o(x) = exp{— (x + l)}I(x + 1 > 0); likewise, all test statistics in accordance 
with the true alternative hypotheses are i.i.d. having the density function 
fi{x) = exp{— (x + 1 — C)}I(x + 1 > C). It is easily seen that xq = oo, and 
limsup3,^oo /i(x)//o(x) = exp(C) < oo. An appeal to Theorem 4.4 yields 
Q^a?^ > and a™^^ > 0, and thus both the FDR and FDR^, procedures 
will encounter the LIP. Moreover, if C > log(2), exp(— C)/2 < A < 0.5 and 
the neighborhood size A; > 3 is an odd integer, then sufficient conditions in 
Theorem 4.5 are satisfied and hence > a^^^ . 

Actual computations indicate that in this example, as long as C > log (4), 
a™^^ is considerably smaller than a^^, indicating that the FDR^ proce- 
dure can adopt a control level much smaller than that of the conventional 
FDR procedure without excessively encountering the LIP. For example, set 
A = 0.1; assume that the neighborhood in the FDR^ procedure is depicted 
in the right panel of Figure 1, that is, k = 5. Table 2 compares values of 
and a™^^ for C = log(4j), j = 2, . . . ,9. Refer to (C.2) and (C.5) in 
Appendix C for detailed derivations of and a™^^ , respectively. 

To better visualize the LIP from limited data. Figure 2 compares the 
regions detected as significant by the FDR and FDR^, procedures for C = 
log(8) based on one realization of the simulated data. It is observed from 
Figure 2 that for a between and 0.4, the FDR procedure lacks the ability 
to detect statistical significance; as a increases to 0.413 (which is the limit 
= 0.413 as calculated in Table 2) and above, some significant results 
emerge. In contrast, for a close to 0, both Method I and Method II for the 




Fig. 1. Left panel: the true significant regions for the 2D simulated data sets. Right panel: 
neighbors of a point at {x,y) used m the FDR_t procedure for 2D simulated data. 
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Table 2 
Comparing aZ^^ and o^^^ 



c 


log(8) 


log(12) 


log(16) 


log(20) 


log(24) 


log(28) 


log(32) 


log(36) 


<-J-oo 


0.4130 
0.0103 


0.3043 
0.0030 


0.2471 
0.0013 


0.2079 
0.0007 


0.1795 
0.0004 


0.1579 
0.0002 


0.1409 
0.0002 


0.1273 
0.0001 



FDRi procedure are able to deliver some significant results. Similar plots 
to those in Figure 2 are obtained with other choices of C and hence are 
omitted for lack of space. 



5. Simulation study: 2D dependent data. 



5.1. Example 1. To illustrate the distinction between the FDR^, and 
the conventional FDR procedures, we present simulation studies. The true 
significant regions are displayed as two black rectangles in the top left panel 
of Figure 3. The data are generated according to the model 

(5.1) Y{i,j)=^{i,3)+e{i,j), i,j = l,...,258, 



20 
40 



C=log(8), FDR, a = 
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FDR, a = 0.05 



FDR,, Method I, a = 



FDR,, Method II, a = 
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40 



20 
40 
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20 • 
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40 
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FDR,, Method I, a = 0.05 



20 
40 



10 20 30 40 50 
FDR,, Method I, a = 0.4 




40 

10 20 30 40 50 
FDR,, Method I, a = 0.413 
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FDR,, Method II, a = 0.05 



10 20 30 40 50 
FDR,, Method II, a = 0.4 
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FDR,, Method II, a = 0.413 
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Fig. 2. Lack of identification phenomenon when a varies from to a^Jf"' = 0.4130. The 
sites that are called statistically significant based on the realization are shown in black. 
Left panels: the FDR procedure. Middle panels: the FDRi procedure using Method I. Right 
panels: the FDR_t procedure using Method II. 
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where the signals are = for G Vq, fJ'ihj) =4 in the larger 

black rectangle and = 2 in the smaller black rectangle. The errors 

have zero-mean, unit-variance and are spatially dependent, by tak- 
ing £{i,j) = {e{i - 1, j) + e{i,j) + e{i + 1, j) + e{i,j - 1) + e{i,j + l)}/\/5, 
where {e(i, j)}?^?^o i-i-d. A^(0, 1). At each pixel y{i,j) is used as 

the test statistic for testing iJ,{i,j) = against fJ^{i,j) > 0. 

Both FDR and FDR/, procedures are preformed at a common control 
level 0.01, with the tuning constant A = 0.1. In the FDR^ procedure, the 
neighborhood of a point at {x,y) is taken as in the right panel of Figure 1. 
The histogram of the original p- values plotted in Figure 3(a) is flat except 
a sharp rise on the left border. The flatness is explained by the uniform 
distribution of the original p- values corresponding to the true null hypothe- 
ses, whereas the sharp rise is caused by the small p-values corresponding to 
the true alternative hypotheses. The histogram of the median aggregated 
p*-values in Figure 3(c) shows a sharp rise at the left end and has a shape 
symmetric about 0.5. The approximate symmetry arises from the limit dis- 
tribution of p*-values corresponding to the true null hypotheses [see (3.2)], 
whereas the sharp rise is formed by small -values corresponding to the 
true alternative hypotheses. Figures 3(b), (d) and (d') manifest that the 
FDR procedure diminishes the effectiveness in detecting the significant re- 
gions than the FDR/, procedure, demonstrating that the FDR/ procedure 
more effectively increases the true positive rates. As a comparison. Figures 
3(e), (f) and (f) correspond to using the mean (other than median) filter 
for aggregating p-values. It is seen that the detections by the median and 
mean filters are very similar; but compared with the mean, the median bet- 
ter preserves the edge of the larger black rectangle between significant and 
nonsignificant areas. This effect gets more pronounced when a increases, 
lending support to the "edge preservation property" of the median. 
_ To evaluate the performance of Method I and Method II in estimating 
G*{t), the bottom panels of Figure 3 display the plots of G*{t) versus G*{t) 
and G*{t) versus G*{t). The agreement with 45 degree lines well supports 
both estimation methods. 

To examine the overall performance of the estimated FDR(t) and FDRi(t) 
for a same threshold t € [0, 1], we replicate the simulation 100 times. For no- 
tational convenience, denote by FDP(t) = V{t)/{R{t) V 1} and FDPL(t) = 
V*{t)/{R*{t) V 1} the false discovery proportions of the FDR and FDR^ 
procedures, respectively. The average values (over 100 data) of FDR(t) and 
FDRi(t) at each point t are plotted in Figure 4(a). It is clearly observed 
that FDRi(t) using both Methods I and II is below FDR(t), demonstrat- 
ing that the FDR/, procedure produces the estimated false discovery rates 
lower than those of the FDR procedure. Meanwhile, Figure 4 compares the 
average values of FDP(t) and those of FDR(t) in panel (b), and the average 
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true significant regions 



(a) histogram of original p-values 



(b) detection by FDR 
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(c) tiistogram of median p -values 
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(d) detection by FDR , Metfiod I 



60 
100 
150 

200 
250 _ 




50 

100 
150 
200 
250 



50 100 150 200 250 
(f) detection by FDR^, Metfiod I 
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(d') detection by FDR Method II 
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(f) detection by FDR^, Method II 




Fig. 3. Comparison of the FDR and FDRi procedures for Example 1. In the first row, 
left: true significant regions shown in black; middle: histogram of the original p-values; 
right: significant regions detected by the FDR procedure. In the second row, left: histogram 
of the p* -values using the median filter; middle and right: significant regions detected by 
the FDRi procedure using Methods I and 11, respectively. In the third row, left: histogram 
of the p* -values using the mean filter; middle and right: significant regions detected by the 
FDRl procedure using Methods I and 11, respectively. In the bottom row, left: G*{t) versus 

G*{t); right: G^it) versus G*{t); straight line: the 45 degree reference line. Here a = 0.01 
and X = 0.1. 
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(a) (b) 




0.2 0.4 0.6 0.8 1 0.2 0.4 0.6 0.8 1 



(c) (d) 




0.2 0.4 0.6 0.8 1 0.2 0.4 0.6 0.8 1 

( t 



Fig. 4. Panel (a); compare the average values o/ FDR(t) and those o/FDRi(t) using 
Methods I and II. Panel (b); compare the average values o/FDR(t) and those o/FDP(t). 
Panel (c); compare the average values o/FDR_t(f) using Method I and those o/FDP_t(t). 
Panel (d) ; compare the average values of FDRl {t) using Method II and those of FDPi it) . 
Here A = 0.1. 



values of FDPi(i) using Methods I and II and those of FDR/,(t) in panels 
(c) and (d), respectively. For each procedure, the two types of estimates 
are very close to each other, lending support to the estimation procedure in 
Section 3.4. 



5.1.1. Sensitivity and specificity. To further study the relative perfor- 
mance of the FDR and FDR^ procedures, we adopt two widely used per- 
formance measures. 



sensitivity ^ 



S'(ia(FDR))/ni, for the FDR procedure, 

S*{ta, (FDR L ) ) /ni , for the FDRl procedure. 



specificity = { ^ (*"(FDR))/no, for the FDR procedure, 

\ U*{tc{YDRL))/no, for the FDRl procedure, 

for summarizing the discriminatory power of a diagnosis procedure, where 
S{t) = X]r=i I{-^o(^) is false, and Pi<t}, U{t) = J27=i^{^o{i) is true, and 
p^>t}, S*{t) = YJl=i I{^o(i) is false, and p* < t} and U*{t) = ^^^i I{^o(i) 
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Fig. 5. Comparison of the average sensitivity (top panels), average specificity (middle 
panels) and average false discovery proportion (bottom panels). Left panels: A = 0.1. Right 
panels: X — 0.4. 



is true, andp* > t}. Here, the sensitivity and specificity measure the strengths 
for correctly identifying the alternative and the null hypotheses, respectively. 

Following Section 5.1, we randomly generate 100 sets of simulated data 
and perform FDR and FDR^ procedures for each dataset, with the control 
levels a varying from to 0.1. The left panel of Figure 5 corresponds to A = 
0.1, whereas the right panel corresponds to A = 0.4. In either case, we observe 
that the average sensitivity (over the datasets) of the FDRj;, procedure using 
Method I is consistently higher than that of the FDR procedure, whereas 
the average specificities of both procedures approach one and are nearly 
indistinguishable. In addition, the bottom panels indicate that the FDR 
procedure yields larger (average) false discovery proportions than the FDR l 
procedure. It is apparent that the results in Figure 5 are not very sensitive 
to the choice of A. Unless otherwise stated, A = 0.1 will be used throughout 
the rest of the numerical work. 



5.2. Example 2: More strongly correlated case. We consider a dataset 
generated according to the same model (5.1) as in Example 1, but with more 
strongly correlated errors, by taking e{i,j) = X^i=oSj=o^(^'j)/'''' where 
{e{i,j)}fY=o ^-^-d. N{0, 1). As seen from the figure in Zhang, Fan and Yu 
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(2010), both FDR and FDR^ (using Methods I and II) procedures perform 
worse with strongly-correlated data than with low-correlated data (given in 
Figure 3). However, there are no adverse effects by applying FDR^, to more 
strongly correlated data, and Method I continues to be comparable with 
Method II for the FDR/, procedure. 

5.3. Example 3; Large proportion of boundary grid points. The efficacy 
of the FDRi procedure is illustrated in the figure of Zhang, Fan and Yu 
(2010) by a simulated dataset generated according to the same model (5.1) 
as in Example 1, but with a large proportion of boundary grid points, where 
/i(i,j) = for {i,j) G Vq and IJ.{i,j) = 4 for £ Vi. Similar plots using 

l^-ihj) = 2 for (i, j) G Vi are obtained and thus omitted. Again, there is no 
adverse effect of using FDR^ to detect dense or weak signals. 

6. Simulation study: 3D dependent data. We apply the FDR and FDR/, 
procedures to detect activated brain regions of a simulated brain fMRI 
dataset, which is both spatially and temporally correlated. The experiment 
design, timings and size are exactly the same as those of the real fMRI 
dataset in Section 7. The data are generated from a semi-parametric model 
similar to that in Section 5.2 of Zhang and Yu (2008). (They demonstrated 
that the semi-parametric model gains more flexibilities than existing para- 
metric models.) The left panel of Figure 6 contains 9 slices (corresponding 
to the 2D axial view) which highlight two activated brain regions involving 
91 activated brain voxels. The neighborhood used in the FDR/, procedure 
is illustrated in the right panel of Figure 6. 




Fig. 6. Left panel: true activated brain regions (denoted by hot color) for the simulated 
fMRI dataset. Right panel: neighbors of a point at {x,y,z) used m the FDR_l procedure 
for 3D simulated and real data. 
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Fig. 7. Comparison of activated brain regions detected for the simulated fMRI dataset 
using the conventional FDR approach (on the left) and the proposed FDRl procedure (on 
the right) using Method I. Top panels: K. Bottom panels: Kbc- Here a = 0.05. 



Figure 7 compares the activated brain regions identified by the FDR (in 
the left panels) and FDR^ (in the right panels) procedures. Owing to the 
wealth of data, and for purposes of computational simplicity, results us- 
ing Method I of FDR^, are presented. Voxel-wise inactivity is tested with 
the semi-parametric test statistics IC = (^h)^{A(S^.R"^S)~^A'^}~^(^h)/ 
{r^^"^r/(n-rm)} (in the top panels) and Kbc = (Ahbc)^{A(S'^-R"-^S)~^ x 
A^}~^{A\^^c)/{)^J^cR~^^'oc/{'n — rrn)} (in the bottom panels) whose notation 
was given and asymptotic distributions were derived in Zhang and Yu 
(2008). The control level is 0.05. Inspection of Figure 7 reveals that K and 
ICbc locate both active regions. In particular, using the FDR procedure. 
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Fig. 8. Comparison of activated brain regions detected for the simulated fMRI dataset 
using the conventional FDR approach (on the left) and the proposed FDRl procedure (on 
the right) using Method I. Top panels: AFNI. Bottom panels: FSL. Here q = 0.05. 



both methods detect more than 200 voxels (which are visible when zoom- 
ing the images), many of which are falsely discovered. When applying the 
FDRx, procedure, IK detects 82 voxels, whereas Kbc detects 90 voxels. Thus 
the FDRi procedure reduces the number of tiny scattered false findings, 
gaining more accurate detections than the FDR procedure. 

As a comparison, the detection results by popular software AFNI [Cox 
(1996)] and FSL [Smith et al. (2004) and Woohich et al. (2001)] are given in 
Figure 8. We observe that both AFNI and FSL fail to locate one activated 
brain area, and that the other region, though correctly detected, has ap- 
preciably reduced size relative to the actual size. This detection bias is due 
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to the stringent assumptions underlying AFNI and FSL in modeling fMRI 
data: the Hemodynamic Response Function (HRF) in FSL is specified as 
the difference of two gamma functions, and the drift term in AFNI is speci- 
fied as a quadratic polynomial. As anticipated, applying the F distributions 
restricted to parametric models to specify the distributions of test statistics 
in AFNI and FSL leads to bias, which in turn gives biased calculations of 
p-values and j>*-values. In this case, the detection performances of both the 
FDR and FDR^ procedures deteriorate, and the FDRx, procedure does not 
improve the performance of the FDR procedure. See Table 3 for a more 
detailed comparison. 

To reduce modeling bias, for applications to the real fMRI dataset in 
Section 7, we will only employ the semi-parametric test statistics K and ICbc- 
It is also worth distinguishing between the computational aspects associated 
with the FDRl procedure: this paper uses (3.3) for the null distribution of 
p*-values, whereas Zhang and Yu (2008) used the normal approximation 
approach in Section 3.2. 

7. Functional neuroimaging example. In an emotional control study, sub- 
jects saw a series of negative or positive emotional images, and were asked 
to either suppress or enhance their emotional responses to the image, or to 
simply attend to the image. The sequence of trials was randomized. The time 
between successive trials also varied. The size of the whole brain dataset is 
64 X 64 X 30. At each voxel, the time series has 6 runs, each containing 185 
observations with a time resolution of 2 seconds. For details of the dataset, 
please refer to Zhang and Yu (2008). The study aims to estimate the BOLD 
(Blood Oxygenation Level-Dependent) response to each of the trial types 
for 1-18 seconds following the image onset. We analyze the fMRI dataset 
containing one subject. The length of the estimated HRF is set equal to 18. 
Again, the neighborhood used in the FDR^^ procedure is illustrated in the 
right panel of Figure 6. 

A comparison of the activated brain regions using the FDR and FDR^, 
procedures is visualized in Figure 9. The level 0.001 is used to carry out 
the multiple comparisons. The conventional FDR procedure finds more tiny 
scattered active voxels, which are more likely to be falsely discovered. In con- 
trast, the FDRi procedure finds activation in much more clustered regions 
of the brain. 

8. Discussion. This paper proposes the FDR^, procedure to embed the 
structural spatial information of p-values into the conventional FDR proce- 
dure for large-scale imaging data with a spatial structure. This procedure 
provides the standard FDR procedure with the ability to perform better 
on spatially aggregated p-values. Method I and Method II have been devel- 
oped for making statistical inference of the aggregated p-values under the 
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Table 3 

Comparing FDR and FDRl procedures 



Test methods 





Multiple comparison 


K 




AFNI 


FSL 


# of detected voxels 


FDR 


276 


870 


16 


6 




FDRl, Method I 


82 


90 


2 


11 


False discovery proportion 


FDR 


0.6993 


0.9000 


0.5625 







FDRl, Method I 








0.5000 





Sensitivity 


FDR 


0.9121 


0.9560 


0.0769 


0.0659 




FDRl, Method I 


0.9011 


0.9890 


0.0110 


0.1209 


Specificity 


FDR 


0.9921 


0.9678 


0.9996 


1.0000 




FDRl, Method I 


1.0000 


1.0000 


0.9997 


1.0000 



null. Method I gains remarkable computational superiority, particularly for 
large/huge imaging datasets, when the p*-values under the null are not too 
skewed. Furthermore, we provide a better understanding of a "/ac/c of iden- 
tification phenomenon''^ (LIP) occurring in the FDR procedure. This study 
indicates that the FDR^ procedure alleviates the extent of the problem and 
can adopt control levels much smaller than those of the FDR procedure 
without excessively encountering the LIP, thus substantially facilitating the 
selection of more stringent control levels. 

As discussed in Owen (2005) and Leek and Storey (2008), a key issue with 
the dependencies between the hypotheses tests is the inflation of the variance 
of significance measures in FDR-related work. Indeed, similar to FDR, the 
FDRi procedure (using Methods I and II) performs less well with highly- 
correlated data than with the low-correlated data. Detailed investigation of 
the variance of FDR^ will be given in future study. 

Other ways of exploring spatially neighboring information are certainly 
possible in multiple comparison. For example, the median operation applied 
to p-values can be replaced by the averaging, kernel smoothing, "majority 
vote" and edge preserving smoothing techniques [Chu et al. (1998)]. Hence, 
taking the median is not the unique way to aggregate p- values. On the other 
hand, compared with the mean, the median is more robust, computationally 
simpler and does not depend excessively on the spatial co-ordinates, espe- 
cially on the boundaries between significant and nonsignificant regions, as 
observed in Figures 3(d) and (f). An exhaustive comparison is beyond the 
scope of the current paper and we leave this for future research. 
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Fig. 9. Comparison of activated brain regions detected for the real fMRI dataset using 
the conventional FDR approach (on the left) and the proposed FDRl procedure (on the 
right) using Method I. Top panels: K. Bottom panels: Kbc- Here Q! = 0.001. 



APPENDIX A: PROOFS OF THEOREMS 4.1-4.3 

We first impose some technical assumptions, which are not the weakest 
possible. Detailed proofs of Theorems 4.1-4.3 are given in Zhang, Fan and 
Yu (2010). 

Condition A. 

AO. The neighborhood size k is an integer not depending on n. 
Al. liiain^ao no / n = ttq exists and ttq < 1. 

A2. lim„_>.oo V*{t)/no = G5(t) and lim^^oo S*{t)/ni = almost sm'ely 
for each t G (0, 1], where Gq and are continuous functions. 
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A3. < < for each t G (0, 1]. 

A4. sup(g(o,i] \G*{t) — G*°^{t)\ = o(l) almost surely as n — )• oo. 

APPENDIX B: PROOFS OF THEOREMS 4.4 AND 4.5 

B.l. Proof of Theorem 4.4. By the assumptions and Hi{v), we see that 
the p-value has the expression, p{v) = 1 — Fq{T{v)). Thus, the distribution 
function of p{v) corresponding to the true Hq{v) is Go(t) = t for < t < 1 and 

(2.3) gives FDR°°(t) = ''''^''ll^+^G^lyl^'^'' ■ Also, the distribution function 
of p{v) corresponding to the true Hi{v) is given by 
(B.l) Gi(t) = l-Fi(Fo-Hl-t)). 

Likewise, using (3.2), it follows that with probability one, 

V*{t) 



Gl{t) = lim 



n— >cx> 



no 



(B.2) + lim 



lim 7-7 • lim — 



no 



P{p*{v)<t} withuGV, 



(0) 




= G*'^{t) = -B(fc+l)/2,(fc+l)/2(0' 

the cumulative distribution function of a Beta((/c + l)/2, {k + l)/2) random 
variable and 

(B.3) G\{t) = lim S*{t)/ni = %+i)/2,(fc+i)/2(Gi(t)). 

Applying (B.2) and (4.1) gives FUR^ (t) = ""^^^j^+'^jj^li^ilg^.f^'^^ 

— ~-00 

Part I. For the FDR procedure, note that FDR (t) is a decreasing func- 
tion of Gi{t)/t. Applying L'Hospital's rule and the fact limt^o-i- (i) =0, 

(B.4) lim ^ = lim ^■W;"-')> = lim MSl . 

t^o+ t t^Q+ fQ[F^^{l-t)) ^•^^■o-/o(x) 

where x = Fq^{1 — t). Thus, supo<f<i Gi(t)/t = oo, which together with 

FDR (t) shows = for the FDR procedure. 

For the FDR^ procedure, applying (B.2) and (B.3), we get 

(B.5) = = ^^^^-A_^t('=-i)/2(i _ t)(fc-i)/2, 



dt dt [{(A;-l)/2} 
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— oo 

Note that FDR^ (t) is a decreasing function of G*(t)/Go(t). Since 
limt^o+ Gl{t) = and limt_j.o+ Go(t) = 0, 

lin. 9m = dCm/dt 
t^o+ GUt) t^o+ dGl (t)/dt 

(B.7) 

t^o+\ t ' 1-t ) dt ' 

which together with (B.4) shows Hmt_>o+ Gi(t)/Go(t) = oo. Thus, 

sup Glit)/G*oit) = oo, 

0<i<l 

that is, a^^^ = for the FDRj;, procedure. 

Part II. Fohowing FDR [t] and FDR^ (t), we immediately conclude that 
«™R / if 

(B.8) sup Gi(t)/t<oo, 

0<t<l 

and that a™^^ / if 

(B.9) sup Gl{t) / G*Q{t) <oo. 

0<t<l 

We first verify (B.8) for the FDR procedure. Assume (B.8) fails, that is, 
supo<t<i Gi{t)/t = oo. Note that for any S >0, the function Gi{t)/t, for t G 
[5, 1], is continuous and bounded away from oo, thus, supo<f<i Gi(t)/t = oo 
only if there exists a sequence ti > t2 > • • • > 0, such that limm^oo = 
and limm-i-oo Gi{tm)/tm = oo. For each m, recall that both Gi(t) and t are 
continuous on [0,trrt]i and differentiable on (0,tm)- Applying Cauchy's mean- 
value theorem, there exists £ (0,tm) such that Gi{tm)/tm = {Gi{tm) — 
G'i(0)}/(tm - 0) = '^'^j-/*^ \t=u- Since lim^-^oo Gi{tm)/tm = oo, it follows that 

(B.IO) limsup^^i^ = oo. 

i^0+ dt 

On the other hand, the condition limsup^_j.2,p_ < oo indicates that 
(B.ll) hmsup — -— =limsup , Jli ,^ — =limsup— — - < oo, 

where j; = Fo"^(l - t). Clearly, (B.ll) contradicts (B.IO). 

Next, we show (B.9) for the FDR^ procedure. Combining (B.7), (B.8) 
and (B.ll), the result follows. This completes the proof. 
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Lemma 1. Let B(t) be the cumulative distribution function of a Beta(a, a) 
random variable, where a> 1 is a real number. Then I for t E (0, 0.5), B{t)/t 
is a strictly increasing function and B{t) < t; II for t G (0.5, 1), B{t) > t; III 
/or ti G (0,0.5] andt2£[tiA], B{ti)/ti < B{t2)/t2. 

Proof. Let r(-) denote the Gamma function. It is easy to see that 
(B.12) B"{t) = r(2a)/{r(a)}2(a - 1)^-^1 - t)"-2(l - 2t). 

To show part I, define Fi{t) = B{t)/t. Then F[{t) = {B'{t)t - B{t)}/t^, 
where d{B'it)t~B{t)} ^ B"{t)t. For t G (0,0.5), (B.12) indicates B"{t) > 0, 
that is, B'{t)t-B{t) is strictly increasing, implying B'{t)t-B{t) > B'(0)0- 
-B(O) = 0. Hence for t G (0,0.5), B{t)/t is strictly increasing, and therefore 
S(i)/t<S(0.5)/0.5 = l. 

For part II, define F2{t) = B{t) - t. Then F^'(t) = B"{t). By (B.12), 
B"{t) < for t G (0.5, 1), thus F2{t) is strictly concave, giving F2{t) > max{F2(( 
F2(l)} = 0. 

Last, we show part III. For t2 G [ii,0.5], part I indicates that B[ti)/ti < 
B{t2)/t2', for t2 G [0.5, 1], part II indicates that B{t2)/t2 > 1 which, combined 
with B{ti)/ti < 1 from part I, yields B{ti)/ti < B{t2)/t2. □ 

We now prove Theorem 4.5. It suffices to show that 
(B.13) {1 - Gi(A)}/(l - A) > {1 - Gl{\)}/{1 - Gl{\)}, 

(B.14) sup Gi{t)/t< sup G\{t)/Gl{t). 

0<t<l 0<t<l 

Following (B.5) and (B.6), for < t < 1, 
(B.15) G\{t) = Gl{G^{t)). 

Applying (B.15), (B.l), 1 - Fo(Ff ^(0.5)) < A and part II of Lemma 1 yields 
Gi{\) < G\{\); applying A < 0.5 and part I of Lemma 1 implies A > G'q(A). 
This shows (B.13). 

To verify (B.14), let M = supo<t<i Gi(t)/t. Since Gi(l)/1 = 1, we have 
M > 1 which will be discussed in two cases. Case 1: if M = 1, then 

(B.16) sup ^ > N = 1 = sup . 

o<t<i GQ[t) Gq[1) o<t<i t 

Case 2: if M > 1, then there exists to £ [0, 1] and tn G (0, 1) such that 
lim„^ooin = tQ, and 

(B.17) lim Gi{tn)/tn = sup Gi{t)/t = M >1. 
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Thus, there exists A^i such that for ah n> Ni, 

(B.18) Giitn)>tn. 

Cases of to = 1) = and to G (0, 1) wih be discussed separately. First, if 
to = 1, then M = hm„_5.oo Gi{tn)/tn = limn-5>oo Gi(t„) < 1, which contradicts 
(B.17). Thus to < 1. Second, if to = 0, then there exists such that t„ < 0.5 
for aU n> N2. Thus for aU n > = max{A^i, iV2}, applying (B.15), (B.18) 
and part III of Lemma 1, we have that 

Gljt^) Gg(Gi(t„)) ^ 

Gl{tn) Gi{tn) ~ tn 

This together with (B.17) shows 

miQ^ G^t) Glitn) ^ , G'l(tn) Gi(t) 
(B.19) sup — - >lmisup > lim — = M= sup — — . 

0<t<l ^OV''/ n-i-oo (jQ[tn) n^oo t„ 0<t<l ^ 

Third, for to £ (0, 1), since both Fq and Fi are differentiable and /q is sup- 
ported in a sing le interval, Gi(t)/t = {1 - Fi(Fq'^{1 - t))}/t is differentiable 
in (0,1). Thus, 

(B.20) sup Gi(t)/t = Gi(to)/to = M 

0<i<l 

and ^^i^^^|f=to = 0. Notice 



d{Gi{t)/t] 



^ {dGi{t)/dt)\t=to-Gi{to)/to 

t=to *0 



dt 

(B.21) 

{dGi{t)/dt)\t=t,-M ^ 
to 

If to > 0.5, then Fg~^(l - to) < Fq"^(0.5). By (B.4) and the assumption on /o 

and /i, ^^\t=to = MFo\l - to))/fo{F^\l - to)) < 1, which contradicts 
(B.21). Thus, < to < 0.5. This together with (B.15), (B.20), and part III 
of Lemma 1 gives 

Gt(to)_G5(Gi(to))^GS(to) 



G'i(to) Gi(to) - to ■ 
This, together with (B.20), shows 

(B.22) ™p Slfi > SM > ^ = M = sup 

0<i<l GQ(tj GQ(toj to 0<i<l t 

Combining (B.16), (B.19) and (B.22) completes the proof. 
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APPENDIX C: a™^ AND a™^^ IN TABLE 2 OF SECTION 4.3 
Before calculating and a™^^, we first present two lemmas. 

Lemma 2. Let f[x) and g{x) he differentiable functions inx £ (a, b) C R. 
Suppose that g{x) ^ for x G {a,b), and f{x)/g{x) is a nonincreasing func- 
tion of X. For any C G (0,oo) such that g{x) + C ^ 0, if df{x)/dx < dg{x)/dx 
for all X G {a,b), then {f{x) + C}/{g{x) + C} is a decreasing function in 
X G (a, b). 

Proof. The proof is straightforward and is omitted. □ 

Lemma 3. The function h{x) = (10 - Ibe^ x + Qe'^'^ x'^) / {IQ - Ihx + Gx^) 
is decreasing in x £ (0,e~'^), for any constant C G (log(4),oo). 

Proof. The function /i(x) can be rewritten as = {6(— e'^x-|-5/4)^-|- 
5/8}/{6(— x-|-5/4)^ + 5/8}. Note that (— e'^x)/(— x) = e^ is nonincreasing in 
x and e*^ > 1 for X > 0. Applying Lemma 2, (— e'^x + 5/4)/(— x + 5/4) is de- 
creasing in X G (0,e~'^), so is (— e'^x + 5/4)^/(— x + 5/4)^. When C > log(4), 
(i{(-e'^x + 5/4)2 }/(ix < d{{-x + b/ A)'^} / dx. This together with Lemma 2 
verifies that h{x) is decreasing in x G {Q,e~^). □ 



First, we evaluate a^^. From (B.l) and the conditions in Section 4.3, 



(C.l) Go{t)=t for tG [0,1] and G,{t) = l^^f^^ Vtl'f^^'' 



^ — 

Thus supo<i<i Gi{t)/t = e'^ . By FDR (t) in Appendix B, 

FDR TTQ + ^i{l - Ae^I(A < e-^) - I(A > e-^)}/(l - A) 

Next, we compute a™^^. Recall from Appendix B that the distribu- 
tion C^it) with A; = 5 is that of a Beta(3,3) random variable. Similarly, by 
(C.l), the distribution G\{i) is that of a Beta(3, 3)/e'^ random variable. By 

~-oo - — — ~-oo 

FDR^ (t) in Appendix B, FDR^ it) is a decreasing function of G\{t) j G*Q{t) , 
for which two cases need to be discussed. In the first case, t G (0,e-^], it 
follows that 

G\ (t) Gn (t) = e^^ ^ , 

°^ ^ 10-15t + 6t2 

which according to Lemma 3 is a decreasing function of t. Thus, 



sup G\{t)/Gl{t) = \,mGl{t)/Gl{t) = e 



3C 
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and 

(C.3) int Fi5RrW = '°+'""-'^'<^''^.'-^°W'. 

In the second case, t G (e~ , 1], since G\{t) = 1, we observe from FDRj;^ (i) 

— — — oo 

in Appendix B that FDR^ (t) is an increasing function of Go(t), and thus 
(C.4) inf g5i^^(,), ^o + ^.{l-0;(A)}/{l-G5(A)} 

Note that for C > 0, we have 

- <^ ^ = e3^. 



G5(e-^) 6(e-c - 5/4)2 + 5/3 - _ 5/4)2 + 5/3 
Combining (C.3) and (C.4) gives 

(C «FDR, _ 7ro + 7ri{l-Gl(A)}/{l-G*(A)} 

This completes the proof. 
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SUPPLEMENTARY MATERIAL 

Proofs and figures (DOI: 10.1214/10-AOS848SUPP; .pdf). Section 1 gives 
detailed proofs of Theorems 4.1-4.3, Section 2 gives the figure in Section 5.2, 
and Section 3 gives the figure in Section 5.3. 
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